Lambda Phage analysis to determine conversion efficiency
Read data location
/Volumes/NGS\ Drive/NGS\ Raw\ Data/Oyster_gonad_Bisulfite/full/filtered_174gm_A_NoIndex_L006_R1.fastq.gz
/Volumes/NGS\ Drive/NGS\ Raw\ Data/Oyster_gonad_Bisulfite/full/filtered_174gm_A_NoIndex_L006_R2.fastq.gz
code
./bsmap -a //Volumes/NGS\ Drive/NGS\ Raw\ Data/Oyster_gonad_Bisulfite/full/filtered_174gm_A_NoIndex_L006_R1.fastq.gz -b /Volumes/NGS\ Drive/NGS\ Raw\ Data/Oyster_gonad_Bisulfite/full/filtered_174gm_A_NoIndex_L006_R2.fastq.gz -d /Volumes/web/cnidarian/phage_lambda_genome.fasta -o /Volumes/web/cnidarian/BiGO_BSMAP_Gonad_lambda.sam -p 2 -n 1
*Note*
-n [0,1] set mapping strand information:
-n 0: only map to 2 forward strands, i.e. BSW(++) and BSC(-+) (i.e. the "Lister protocol")
for PE sequencing, map read#1 to ++ and -+, read#2 to +- and --.
-n 1: map SE or PE reads to all 4 strands, i.e. ++, +-, -+, -- (i.e. the "Cokus protocol")
default: -n 0. Most bisulfite sequencing data is generated only from forward strands.
Input reference file: /Volumes/web/cnidarian/phage_lambda_genome.fasta (format: FASTA)
Load in 1 db seqs, total size 48502 bp. 0 secs passed
total_kmers: 43046721
Create seed table. 2 secs passed
max number of mismatches: read_length * 8% max gap size: 0
kmer cut-off ratio:5e-07
max multi-hits: 100 max Ns: 5 seed size: 16 index interval: 4
quality cutoff: 0 base quality char: '!'
min fragment size:28 max fragemt size:500
start from read #1 end at read #4294967295
additional alignment: T in reads => C in reference
mapping strand (read_1): ++,-+,+-,--
mapping strand (read_2): +-,--,++,-+
Pair-end alignment(2 threads)
Input read file #1: //Volumes/NGS Drive/NGS Raw Data/Oyster_gonad_Bisulfite/full/filtered_174gm_A_NoIndex_L006_R1.fastq.gz (format: gzipped FASTQ)
Input read file #2: /Volumes/NGS Drive/NGS Raw Data/Oyster_gonad_Bisulfite/full/filtered_174gm_A_NoIndex_L006_R2.fastq.gz (format: gzipped FASTQ)
Output file: /Volumes/web/cnidarian/BiGO_BSMAP_Gonad_lambda.sam (format: SAM)
Total number of aligned reads:
pairs: 189393 (0.11%)
single a: 7462 (0.0043%)
single b: 7845 (0.0046%)
Done.
Run SAM through methratio script
python methratio.py -d /Volumes/web/cnidarian/phage_lambda_genome.fasta -u -z -g -o /Volumes/web/cnidarian/BiGO_methratio_lambda_B.txt -s /Volumes/Bay3/Software/BSMAP/bsmap-2.73/samtools /Volumes/web/cnidarian/BiGO_BSMAP_Gonad_lambda.sam
Options:
-h, --help show this help message and exit
-o FILE, --out=FILE output file name. (required)
-d FILE, --ref=FILE reference genome fasta file. (required)
-c CHR, --chr=CHR process only specified chromosomes, separated by ','.
[default: all] example: --chroms=chr1,chr2
-s PATH, --sam-path=PATH
path to samtools. [default: none]
-u, --unique process only unique mappings/pairs.
-p, --pair process only properly paired mappings.
-z, --zero-meth report loci with zero methylation ratios.
-q, --quiet don't print progress on stderr.
-r, --remove-duplicate
remove duplicated reads.
-t N, --trim-fillin=N
trim N end-repairing fill-in nucleotides. [default: 2]
-g, --combine-CpG combine CpG methylaion ratios on both strands.
-m FOLD, --min-depth=FOLD
report loci with sequencing depth>=FOLD. [default: 1]
-n, --no-header don't print a header line
-i CT_SNP, --ct-snp=CT_SNP
how to handle CT SNP ("no-action", "correct", "skip"),
default: "correct".
total 394093 valid mappings, 21067 covered cytosines, average coverage: 272.88 fold.
python methratio.py -d /Volumes/web/cnidarian/phage_lambda_genome.fasta -u -g -p -o /Volumes/web/cnidarian/BiGO_methratio_lambda_C.txt -s /Volumes/Bay3/Software/BSMAP/bsmap-2.73/samtools /Volumes/web/cnidarian/BiGO_BSMAP_Gonad_lambda.sam
total 378786 valid mappings, 21062 covered cytosines, average coverage: 259.85 fold.
python methratio.py -d /Volumes/web/cnidarian/phage_lambda_genome.fasta -u -p -o /Volumes/web/cnidarian/BiGO_methratio_lambda_D.txt -s /Volumes/Bay3/Software/BSMAP/bsmap-2.73/samtools /Volumes/web/cnidarian/BiGO_BSMAP_Gonad_lambda.sam
total 378786 valid mappings, 24173 covered cytosines, average coverage: 226.40 fold.
New BSMAP with alternative -n
code
./bsmap -a //Volumes/NGS\ Drive/NGS\ Raw\ Data/Oyster_gonad_Bisulfite/full/filtered_174gm_A_NoIndex_L006_R1.fastq.gz -b /Volumes/NGS\ Drive/NGS\ Raw\ Data/Oyster_gonad_Bisulfite/full/filtered_174gm_A_NoIndex_L006_R2.fastq.gz -d /Volumes/web/cnidarian/phage_lambda_genome.fasta -o /Volumes/web/cnidarian/BiGO_BSMAP_Gonad_lambda_v2.sam -p 3
Input reference file: /Volumes/web/cnidarian/phage_lambda_genome.fasta (format: FASTA)
Load in 1 db seqs, total size 48502 bp. 0 secs passed
total_kmers: 43046721
Create seed table. 2 secs passed
max number of mismatches: read_length * 8% max gap size: 0
kmer cut-off ratio:5e-07
max multi-hits: 100 max Ns: 5 seed size: 16 index interval: 4
quality cutoff: 0 base quality char: '!'
min fragment size:28 max fragemt size:500
start from read #1 end at read #4294967295
additional alignment: T in reads => C in reference
mapping strand (read_1): ++,-+
mapping strand (read_2): +-,--
Pair-end alignment(3 threads)
Input read file #1: //Volumes/NGS Drive/NGS Raw Data/Oyster_gonad_Bisulfite/full/filtered_174gm_A_NoIndex_L006_R1.fastq.gz (format: gzipped FASTQ)
Input read file #2: /Volumes/NGS Drive/NGS Raw Data/Oyster_gonad_Bisulfite/full/filtered_174gm_A_NoIndex_L006_R2.fastq.gz (format: gzipped FASTQ)
Output file: /Volumes/web/cnidarian/BiGO_BSMAP_Gonad_lambda_v2.sam (format: SAM)
Total number of aligned reads:
pairs: 189394 (0.11%)
single a: 7460 (0.0043%)
single b: 7844 (0.0046%)
*Note*
Changing -n value does not significantly alter number of aligned reads.
python methratio.py -d /Volumes/web/cnidarian/phage_lambda_genome.fasta -u -g -p -o /Volumes/web/cnidarian/BiGO_methratio_lambda_M.txt -s /Volumes/Bay3/Software/BSMAP/bsmap-2.73/samtools /Volumes/web/cnidarian/BiGO_BSMAP_Gonad_lambda_v2.sam
------
Run methratio in order to determine conversion efficiency
python methratio.py -d /Volumes/web/cnidarian/phage_lambda_genome.fasta -u -z -g -p -o /Volumes/web/cnidarian/BiGO_methratio_lambda_N.txt -s /Volumes/Bay3/Software/BSMAP/bsmap-2.73/samtools /Volumes/web/cnidarian/BiGO_BSMAP_Gonad_lambda_v2.sam
@ Tue Apr 16 10:26:09 2013: reading reference /Volumes/web/cnidarian/phage_lambda_genome.fasta ...
@ Tue Apr 16 10:26:09 2013: reading /Volumes/web/cnidarian/BiGO_BSMAP_Gonad_lambda_v2.sam ...
[samopen] SAM header is present: 1 sequences.
@ Tue Apr 16 10:26:33 2013: combining CpG methylation from both strands ...
@ Tue Apr 16 10:26:33 2013: writing /Volumes/web/cnidarian/BiGO_methratio_lambda_N.txt ...
@ Tue Apr 16 10:26:33 2013: done.
total 378776 valid mappings, 21062 covered cytosines, average coverage: 259.84 fold.